Kansas Economic Data
  • Home
  • About

H-1B Visas in Kansas

Select measures of H-1B activity in Kansas.

  • toc: true
  • author: Nigel Soria
  • badges: true
  • categories: [immigration, seasonal, migrant workers, H-1B]
  • permalink: /h-1b/
  • image: images/h1b-bar.png

Definitions

  • Fiscal Year: The fiscal year in which U.S. Citizenship and Immigration Services (USCIS) first recorded an approval or denial in the electronic systems. USCIS follows the U.S. Federal Government fiscal year (FY) calendar, so data sets presented by fiscal year cover Oct. 1 of one year to Sept. 30 of the next year. The applications and petitions USCIS receives on a given date are generally adjudicated on a later date. Therefore, data reflect the date USCIS adjudicated the application or petition, rather than the date received.

  • Initial Approval: H-1B petitions with “new employment” or “new concurrent employment” selected on Part 2, Question 2 of the Form I-129 whose first decision is an approval.

  • Initial Denial: H-1B petitions with “new employment” or “new concurrent employment” selected on Part 2, Question 2 of the Form I-129 whose first decision is a denial.

  • Continuing Approval: H-1B petitions with anything other than “new employment” or “new concurrent employment” selected on Part 2, Question 2 of the Form I-129, whose first decision is an approval. This includes, for example, continuing employment, change of employer, and amended petitions.

  • Continuing Denial: H-1B petitions with anything other than “new employment” or “new concurrent employment” selected on Part 2, Question 2 of the Form I-129 whose first decision is a denial. This includes, for example, continuing employment, change of employer, and amended petitions.

  • NAICS: North American Industry Classification System Code, a character string that stands for an industry classification within the North American Industry Classification System.

Note: The data and the above definitions come from the U.S. Citizenship and Immigration Services.

#hide

import boto3
import dask
import dask.dataframe as dd
import dask.bag as db
from dask.delayed import delayed
import datetime as dt
import json
import matplotlib as mpl
import numpy as np
import os
import pandas as pd
from pathlib import Path
import plotly.graph_objects as go
from urllib.request import urlopen
from IPython.display import Image
from IPython.display import HTML
import time
#hide

# file_path = '../data/uscis/h1b_datahubexport-*.csv'
# url = 'https://www.uscis.gov/sites/default/files/document/data/h1b_datahubexport-*.csv'

# df = dd.read_csv(file_path, dtype=str)

# df = df[df['State']=='KS']

# convert_dict = {
#     'Initial Approvals': int, 'Initial Denials': int, 
#     'Continuing Approvals': int, 'Continuing Denials': int,
#     'NAICS': str, 'ZIP': str
# }
# df = df.astype(convert_dict)

# df = df.compute()

# df['total_approvals'] = df['Initial Approvals'] + df['Continuing Approvals']
# df['total_denials'] = df['Initial Denials'] + df['Continuing Denials']
# df['total_petitions'] = df['total_approvals'] + df['total_denials']

# naics = pd.read_csv('../data/uscis/naics-list.csv', dtype=str)
# df = pd.merge(df, naics, how='left', left_on='NAICS', right_on='sector')

# zips = pd.read_csv('../data/uscis/us-zips.csv', dtype=str)
# df = pd.merge(df, zips[['zip', 'county_name', 'county_fips']], how='left', left_on='ZIP', right_on='zip')

# columns = [
#     'Fiscal Year', 'Employer', 'Initial Approvals', 
#     'Initial Denials', 'Continuing Approvals', 'Continuing Denials', 
#     'total_approvals', 'total_denials', 'total_petitions',
#     'sector', 'description', 'State', 'zip', 'county_name', 'county_fips'
# ]

# names = [
#     'fiscal_year', 'employer', 'initial_approvals', 
#     'initial_denials', 'continuing_approvals', 'continuing_denials', 
#     'total_approvals', 'total_denials', 'total_petitions',
#     'sector', 'description', 'state', 'zip', 'county_name', 'county_fips'
# ]

# name_dict = {columns[i]: names[i] for i in range(len(columns))} 

# df = df[columns]
# df.rename(columns = name_dict, inplace = True) 

# df.to_csv('../data/uscis/ks-h1b-data.csv', index=False)

df = pd.read_csv('data/ks-h1b-data.csv', dtype=str)

convert_dict = {
    'initial_approvals': int, 'initial_denials': int, 
    'continuing_approvals': int, 'continuing_denials': int,
    'total_approvals': int, 'total_denials': int, 'total_petitions': int,
    'sector': str, 'zip': str
}
df = df.astype(convert_dict)
#hide

ks_gold ='#F1AD02'
ks_blue ='#002569'
n=500

def color_scale(c1, c2, n): #fade (linear interpolate) from color c1 (at mix=0) to c2 (mix=1)
    colors = []
    c1 = np.array(mpl.colors.to_rgb(c1))
    c2 = np.array(mpl.colors.to_rgb(c2))

    for x in range(n+1):
        color = mpl.colors.to_hex((1-x/n)*c1 + x/n*c2)
        colors.append(color)

    return colors

Trends in H-1B Approvals in Kansas, FY 2009-2019

The metrics in the graph below come from the following calculations:

\[\begin{align} \text{Total Approvals} &= \text{Initial Approvals} + \text{Continuing Approvals} \\ \text{Total Petitions} &= \text{Initial Approvals} + \text{Continuing Approvals} + \text{Initial Denials} + \text{Continuing Denials} \end{align}\]

#hide

df_line_bar = df.groupby('fiscal_year').agg({'total_approvals':'sum', 'total_petitions':'sum'}).reset_index()

fig = go.Figure()

fig.add_trace(
    go.Scatter(
        x = df_line_bar['fiscal_year'],
        y = df_line_bar['total_approvals'],
        marker_color = 'rgb(241, 173, 2)',
        name = 'Total Approvals',
        yaxis = 'y'
    ),
)

fig.add_trace(
    go.Bar(
        x = df_line_bar['fiscal_year'],
        y = df_line_bar['total_petitions'],
        marker_color = 'rgb(0, 37, 105)',
        name = 'Total Petitions',
        yaxis = 'y'
    ),
)

fig.update_layout(
#     title={
#         'text': '<b>Trends in H-1B Approvals in Kansas, FY 2009-2019</b>',
#         'y':0.9,
#         'x':0.5,
#         'xanchor': 'center',
#         'yanchor': 'top'
#     },
    xaxis=dict(
        showline=True,
        showgrid=False,
        zeroline=True,
        showticklabels=True,
        linecolor='rgb(0, 0, 0)',
        dtick=1.0,
        tickfont=dict(
            family='Arial',
            size=12,
            color='rgb(82, 82, 82)',
        ),
    ),
    yaxis=dict(
#         title={
#             'text': '<b>Total Approvals</b>',
#             'font': {'color': 'rgb(241, 173, 2)'
#             }
#         },
        showgrid=True,
        zeroline=True,
        showline=True,
        linecolor='rgb(0, 0, 0)',
        overlaying='y2',
        gridcolor='rgb(200, 200, 200)',
        rangemode='tozero',
#         range=[0,70000],
        anchor='x',
        separatethousands = True,
        side='left'
    ),
    yaxis2=dict(
#         title={
#             'text': '<b>Total Petitions</b>',
#             'font': {'color': 'rgb(0, 37, 105)'
#             }
#         },
        zeroline=True,
        showline=True,
        linecolor='rgb(0, 0, 0)',
        side='right',
        rangemode = 'tozero',
#         range=[0,140000],
        anchor='x',
        separatethousands = True,
    ),
    xaxis_tickangle=-45,
    showlegend=True,
    plot_bgcolor='white',
)

fig.show()
#hide_input

HTML(fig.to_html(include_plotlyjs='cdn'))

Total H-1B Approvals in FY 2019, by County

Hover your mouse over a county to reveals the county’s name, the number of initial approvals, and the number of continuing approvals. The coloring on the map is based on the total approvals, which is the sum of intial approvals and continuing approvals.

#hide

with urlopen('https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json') as response:
        counties = json.load(response)

df_map = df.groupby(['county_name', 'county_fips']).agg({
    'initial_approvals':'sum', 'continuing_approvals':'sum', 'total_approvals':'sum'
}).reset_index()

colors = color_scale(ks_gold, ks_blue, n)

fig = go.Figure(go.Choroplethmapbox(geojson=counties, locations=df_map['county_fips'], z=df_map['total_approvals'].astype(str),
                                    colorscale=colors, zmin=df_map['total_approvals'].min(), zmax=df_map['total_approvals'].max(),
                                   marker_opacity=0.5, marker_line_width=0,
                                   text= df_map['county_name'].str.capitalize() + ' County'
                                   '<br>Initial Approvals: ' + df_map['initial_approvals'].astype(int).map(lambda x: '{:,}'.format(x)).astype(str) +
                                   '<br>Continuing Approvals: ' + df_map['continuing_approvals'].astype(int).map(lambda x: '{:,}'.format(x)).astype(str),
                                   hoverinfo='text',
                                   colorbar={'title':'Total Approvals', 'separatethousands':True}
                                   ))
fig.update_layout(mapbox_style="carto-positron",
                  mapbox_zoom=6, mapbox_center = {"lat": 38.863379, "lon": -98.234060})
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
# fig.update_layout(legend_title='<b> Total Approvals </b>')
# fig.update_layout(
#     title={
#     'text': '<b>COVID-19 Cases by County</b>',
#     'y':0.95,
#     'x':0.5,
#     'xanchor': 'center',
#     'yanchor': 'top'
#     },
# )

fig.show()
#hide_input

HTML(fig.to_html(include_plotlyjs='cdn'))

Initial and Continuing H-1B Approvals in Fiscal Year 2019, by Industry (Top 5)

#hide

df_ind = df[df['fiscal_year']=='2019']
df_ind = df_ind.groupby('description').agg({'initial_approvals':'sum', 'continuing_approvals':'sum'}).reset_index()
df_ind['total_approvals'] = df_ind['initial_approvals'] + df_ind['continuing_approvals']
df_ind = df_ind[df_ind['total_approvals']>=100]

fig = go.Figure()

fig.add_trace(go.Bar(
    x = df_ind['initial_approvals'],
    y = df_ind['description'],
    name='Initial Approvals',
    orientation='h',
    marker=dict(
        color='rgba(20,37,105, 0.6)',
        line=dict(color='rgba(0,37,105, 1.0)', width=3)
    )
))
fig.add_trace(go.Bar(
    x = df_ind['continuing_approvals'],
    y = df_ind['description'],
    name='Continuing Approvals',
    orientation='h',
    marker=dict(
        color='rgba(241, 173, 2, 0.6)',
        line=dict(color='rgba(241, 173, 2, 1.0)', width=3)
    )
))

fig.update_yaxes(
    showgrid=True,
    zeroline=True,
    showline=True,
#     dtick=1,
    linecolor='rgb(0, 0, 0)'
)
fig.update_xaxes(
    showgrid=True,
    zeroline=True,
#     tickformat=',.0%',
    showline=True,
#     range=[0,.25],
#     dtick=.05,
    linecolor='rgb(0, 0, 0)')

fig.update_layout(
#     title={
#     'text': '<b>Initial and Continuing H-1B Approvals in Fiscal Year 2019, by Industry (Top 5)</b>',
#     'y':0.95,
#     'x':0.5,
#     'xanchor': 'center',
#     'yanchor': 'top'
#     },
    yaxis={'categoryorder':'total ascending'},
    plot_bgcolor='white',
    barmode='stack'
)

fig.show()
# fig.write_image('h1b-bar.png')
#hide_input

HTML(fig.to_html(include_plotlyjs='cdn'))
  • GitHub Repo
  • ™